Conversation

aneubeck
Collaborator

No description provided.

This is our baseline implementation.
The fastest results are achieved with 32 bytes per block, which results in a 12.5% memory overhead.
64 bytes per block seems like a reasonable compromise between speed and memory overhead (6.25%).
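The quoted overheads are consistent with one 4-byte rank counter stored per payload block (4/32 = 12.5%, 4/64 = 6.25%). A minimal sketch of such a layout, assuming that this is where the overhead comes from (names and layout are illustrative, not the actual code):

```rust
// Sketch of the block layout implied by the numbers above (assumed, not the
// actual implementation): one absolute 4-byte rank counter per payload block.
struct Block<const N: usize> {
    rank: u32,     // rank up to the start of this block (4 bytes of overhead)
    data: [u8; N], // N payload bytes, e.g. 32 or 64
}

fn overhead(bytes_per_block: usize) -> f64 {
    4.0 / bytes_per_block as f64 // counter bytes relative to payload bytes
}

fn main() {
    assert_eq!(overhead(32), 0.125);  // 12.5%: fastest configuration
    assert_eq!(overhead(64), 0.0625); // 6.25%: speed/memory compromise
    let _block = Block::<64> { rank: 0, data: [0u8; 64] };
}
```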
…ss from scratch. Instead, we check whether the access still belongs within the same word that we cache locally.
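A minimal sketch of that word-caching idea (the cursor type and its interface are assumptions for illustration, not this PR's API): the cursor remembers the last 64-bit word it loaded and only falls back to a fresh lookup when an access leaves that word.

```rust
// Illustrative only: caches the most recently loaded 64-bit word so that
// consecutive accesses within the same word skip the full recomputation.
struct WordCursor<'a> {
    words: &'a [u64],
    cached_idx: usize, // index of the currently cached word
    cached: u64,       // the cached word itself
}

impl<'a> WordCursor<'a> {
    fn new(words: &'a [u64]) -> Self {
        Self { words, cached_idx: 0, cached: words.first().copied().unwrap_or(0) }
    }

    // Returns the bit at `pos`; reloads the cache only when `pos` falls
    // outside the currently cached word (the slow path).
    fn bit(&mut self, pos: usize) -> bool {
        let idx = pos / 64;
        if idx != self.cached_idx {
            self.cached_idx = idx;
            self.cached = self.words[idx];
        }
        (self.cached >> (pos % 64)) & 1 == 1
    }
}

fn main() {
    let words = [0b0110u64, u64::MAX];
    let mut cur = WordCursor::new(&words);
    assert!(cur.bit(1));  // served from the cached word
    assert!(cur.bit(2));  // still the same word, no reload
    assert!(cur.bit(64)); // crosses a word boundary, triggers a reload
}
```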
add quaternary parallel implementation (~50ns vs. ~80ns for binary).
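As general context for the timing difference (an assumed model, not a description of this PR's code): a quaternary layout consumes two bits of the query per step, so a lookup needs roughly half as many dependent steps as a binary one, which fits the quoted ~50ns vs. ~80ns.

```rust
// Back-of-the-envelope illustration (assumed model, not the PR's code):
// number of dependent levels needed to resolve `n` positions when each
// level consumes `bits_per_level` bits of the query.
fn levels(n: u64, bits_per_level: u32) -> u32 {
    let log2 = 64 - (n - 1).leading_zeros(); // ceil(log2(n))
    (log2 + bits_per_level - 1) / bits_per_level
}

fn main() {
    let n = 1 << 20; // ~1M positions, purely illustrative
    assert_eq!(levels(n, 1), 20); // binary: one bit per level
    assert_eq!(levels(n, 2), 10); // quaternary: two bits per level, half the depth
}
```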
@aneubeck requested a review from a team as a code owner on March 25, 2025, 12:17